On the Two-level Hybrid Clustering Algorithm

نویسندگان

  • Eng Yeow Cheu
  • Chee Keong Kwoh
  • Zonglin Zhou
چکیده

In this paper, we design the hybrid clustering algorithms, which involve two level clustering. At each of the levels, users can select the k-means, hierarchical or SOM clustering techniques. Unlike the existing cluster analysis techniques, the hybrid clustering approach developed here represents the original data set using a smaller set of prototype vectors (cluster means), which allows efficient use of a clustering algorithm to divide the prototype into groups at the first level. Since the clustering at the first level provides data abstraction first, it reduces the number of samples for the second level clustering. The reduction of the number of samples, hence, the reduction of computational cost is especially important when hierarchical clustering is used in the second stage. The prototypes clustered at the first level are local averages of the data and therefore less sensitive to random variations than the original data. The empirical evaluation of the two-level hybrid clustering algorithms is made at four data sets

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Hybrid Grey based Two Steps Clustering and Firefly Algorithm for Portfolio Selection

Considering the concept of clustering, the main idea of the present study is based on the fact that all stocks for choosing and ranking will not be necessarily in one cluster. Taking the mentioned point into account, this study aims at offering a new methodology for making decisions concerning the formation of a portfolio of stocks in the stock market. To meet this end, Multiple-Criteria Decisi...

متن کامل

Generating Optimal Timetabling for Lecturers using Hybrid Fuzzy and Clustering Algorithms

UCTTP is a NP-hard problem, which must be performed for each semester frequently. The major technique in the presented approach would be analyzing data to resolve uncertainties of lecturers’ preferences and constraints within a department in order to obtain a ranking for each lecturer based on their requirements within a department where it is attempted to increase their satisfaction and develo...

متن کامل

A Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS

Data clustering is the process of partitioning a set of data objects into meaning clusters or groups. Due to the vast usage of clustering algorithms in many fields, a lot of research is still going on to find the best and efficient clustering algorithm. K-means is simple and easy to implement, but it suffers from initialization of cluster center and hence trapped in local optimum. In this paper...

متن کامل

Tabu-KM: A Hybrid Clustering Algorithm Based on Tabu Search Approach

  The clustering problem under the criterion of minimum sum of squares is a non-convex and non-linear program, which possesses many locally optimal values, resulting that its solution often falls into these trap and therefore cannot converge to global optima solution. In this paper, an efficient hybrid optimization algorithm is developed for solving this problem, called Tabu-KM. It gathers the ...

متن کامل

Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms

In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...

متن کامل

GROUND MOTION CLUSTERING BY A HYBRID K-MEANS AND COLLIDING BODIES OPTIMIZATION

Stochastic nature of earthquake has raised a challenge for engineers to choose which record for their analyses. Clustering is offered as a solution for such a data mining problem to automatically distinguish between ground motion records based on similarities in the corresponding seismic attributes. The present work formulates an optimization problem to seek for the best clustering measures. In...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004